Alhaji Gwani

Conference 2024 Poster

Poster Title

Machine Learning-Based Approach to Modeling Diabetes Risk Factors

Authors and Affiliations

Alhaji Abdullahi Gwani1,2, Siok Kun Sek1, Abbas Umar Farouk3, Usman Waziri2, Ali B. Adamu3
1. School of Mathematical Sciences, Universiti Sains Malaysia, 11800 Minden, Penang, Malaysia.
2. Department of Mathematical Sciences, Faculty of Science, Bauchi State University Gadau, Gadau 751105 Nigeria.
3. Department of Mathematical Sciences, Abubakar Tafawa Balewa University, Bauchi, Nigeria.

Abstract

Background

Diabetes is characterized by elevated blood glucose (hyperglycemia) caused by insulin resistance or insufficient insulin production. Insulin, a hormone released by the pancreas, regulates blood sugar levels and allows cells to use glucose as an energy source. The global incidence of diabetes has surged, a rise attributed to sedentary lifestyles, poor dietary patterns, and genetic predisposition. The International Diabetes Federation estimated that about 537 million adults worldwide were living with diabetes in 2021. The high cost of managing diabetes and its complications imposes a significant economic burden on healthcare systems globally. The costs to society are immense: decreased productivity, premature death, and the intangible burden of a lower quality of life. To address the growing prevalence of diabetes, public health efforts aim to raise awareness of the condition, enable early detection, and implement preventive measures. Risk stratification is therefore crucial for identifying and managing individuals at elevated risk of developing diabetes.

Methods

Data were analyzed in Python. The first dataset, secondary data obtained from online sources, covers 768 people and includes insulin, blood pressure, skin thickness (SkinT), BMI, age, number of pregnancies, DB_status, and DB_pred. The second dataset, from a Nigerian medical center, records systolic and diastolic blood pressure, insulin, job, diabetes status (DB), age, gender, height, and weight. Both datasets were used to train and assess machine learning algorithms for identifying diabetes risk factors. Patient confidentiality, data security, and ethical standards were maintained. Five models were fitted: classification and regression tree (CART), logistic regression (LR), random forest (RF), support vector machine (SVM), and gradient boosting machine (GBM). Models were evaluated using the receiver operating characteristic (ROC) curve, specificity, sensitivity, positive predictive value, negative predictive value, and the area under the precision-recall curve. Variable importance was assessed using classifier-derived importance measures and Shapley additive explanations (SHAP).
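
The evaluation metrics named in the Methods can be illustrated with a minimal, dependency-free Python sketch. It is not the study's code: the labels, scores, and 0.5 threshold below are made-up illustrative values, and the function names (`screening_metrics`, `roc_auc`) are hypothetical helpers introduced only for this example.

```python
def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def screening_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, and NPV from binary labels (1 = diabetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return {
        "sensitivity": safe_div(tp, tp + fn),  # share of diabetics flagged
        "specificity": safe_div(tn, tn + fp),  # share of non-diabetics cleared
        "ppv": safe_div(tp, tp + fp),          # share of flags that are correct
        "npv": safe_div(tn, tn + fn),          # share of clears that are correct
    }


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return safe_div(wins, len(pos) * len(neg))


# Illustrative data only: true status and predicted risk scores for 8 people.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

metrics = screening_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Sensitivity and specificity depend on the chosen cut-off, whereas the ROC area summarizes ranking quality across all cut-offs, which is why both kinds of measure are typically reported together.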

Results

Machine learning models were used to assess the risk factors that drive hyperglycemia. Across all modeling approaches, the most important variables were age, pregnancies, BMI, glucose, skin thickness, insulin, and hypertension. The models also detected interacting risk markers that carry substantial predictive weight but are not incorporated in existing risk prediction methods. The algorithms predicted diabetes risk accurately and offered insight into the factors contributing to disease risk without making any assumptions about causality.
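
To show how Shapley-based attribution handles interacting risk factors, the sketch below computes exact Shapley values for a toy risk score. Everything in it is an assumption made for illustration: the `toy_risk` function, its coefficients, and the patient and baseline values are hypothetical and are not the fitted models or data from this study. In practice a SHAP library would approximate these values for a trained classifier.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline input.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Hypothetical risk score with a glucose x BMI interaction term."""
    glucose, bmi, age = z
    return 0.02 * glucose + 0.05 * bmi + 0.01 * age + 0.001 * glucose * bmi


patient = [150.0, 35.0, 50.0]   # glucose, BMI, age (made-up values)
baseline = [100.0, 25.0, 30.0]  # reference individual (made-up values)

phi = shapley_values(toy_risk, patient, baseline)
for name, contribution in zip(["glucose", "BMI", "age"], phi):
    print(f"{name}: {contribution:.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, and the interaction term is shared between glucose and BMI rather than assigned to either one alone.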

Conclusions

Modeling diabetes outcomes with machine learning is promising. Related applications include detecting pancreatic cancer, predicting gestational diabetes, identifying obesity risk factors, evaluating stroke recurrence trends, predicting heart failure in diabetics, forecasting diabetic kidney disease progression, analyzing the relationship between blood urea nitrogen levels and cardiovascular disease and mortality rates, and assessing the impact of air pollution on hospital admissions. The study found that using machine learning to capture complex relationships among interacting risk factors improves diabetes prediction. The proposed method can be used in healthcare to identify susceptible individuals who would benefit most from preventive intervention and to reduce unnecessary treatment for low-risk individuals.